Skip to content

Conversation

caitriona-cloudsmith
Copy link
Collaborator

No description provided.

paddycarey and others added 30 commits December 16, 2024 22:45
Specifically, toStartOfMinute, toStartOfFiveMinutes, toStartOfTenminutes, toStartOfFifteenMinutes and toStartOfHour
jayvynl and others added 7 commits June 5, 2025 13:04
* Handle distributed migrations

* add extra test and update README.md
* Demonstrating breaking connection

* Stop reusing broken connection

* Added changelog and explanatory comment to test
@Copilot Copilot AI review requested due to automatic review settings September 19, 2025 10:21
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Pull in upstream features and compatibility updates focused on distributed migrations, columnar/numpy query support, and broader Python/Django version coverage, plus tooling and docs updates.

  • Add distributed migrations support (including MigrationRecorder patches and tests) and HAProxy setup for load-balanced clusters.
  • Enable columnar and NumPy result handling via a cursor context manager; fix pool behavior when iteration is interrupted.
  • Expand supported/tested versions (Django 5.2, Python 3.13), update CI, tox, docs, and changelog.

Reviewed Changes

Copilot reviewed 33 out of 33 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
tox.ini Expand env matrix, add deps tweaks and lint/format updates
tests/settings.py Add PASSWORD for ClickHouse test connections
tests/migrations/test_loader.py Add distributed migration tests
tests/clickhouse_functions/test_datetime.py Add/adjust datetime function tests
tests/clickhouse_fields/tests.py Normalize IPv6 mapped IPv4 test via ipaddress
tests/backends/tests.py Columnar/NumPy cursor tests and iterator interruption behavior
tests/backends/clickhouse/test_driver.py Pool size test and iterator interruption test
tests/aggregation/tests.py Aggregator arity customization for test
pyproject.toml Add classifiers for Django 5.x and Python 3.13
proxy-config/haproxy.cfg New HAProxy config for load-balanced ClickHouse
docs/Configurations.md Document max_block_size and changes
compose.yaml Add passwords, expose HTTP ports, and HAProxy service
clickhouse_backend/patch/migrations.py Patch MigrationRecorder and Migration apply/unapply for distributed migrations
clickhouse_backend/models/sql/compiler.py Remove custom field_as_sql override
clickhouse_backend/models/functions/datetime.py Add toStartOf* and toYearWeek functions
clickhouse_backend/models/fields/tuple.py Replace deprecated is_iterable with collections.abc.Iterable
clickhouse_backend/models/fields/array.py Replace deprecated is_iterable with collections.abc.Iterable
clickhouse_backend/models/fields/init.py Replace deprecated is_iterable and minor checks
clickhouse_backend/models/engines.py Replace deprecated is_iterable and normalize iterable checks
clickhouse_backend/driver/pool.py Pool push logic to avoid reusing executing connections; arg handling
clickhouse_backend/driver/connection.py Add columnar/NumPy execution toggles and cursor preparation
clickhouse_backend/backend/client.py Allow passing DSN to clickhouse-client
clickhouse_backend/backend/base.py Add distributed_migrations and max_block_size; chunked_cursor
clickhouse_backend/init.py Bump version tuple
clickhouse-config/node[1-4]/remote-servers.xml Add password entries for replicas
README.md Document columnar/NumPy support and distributed migrations config
CONTRIBUTING.md Add changelog reminder
CHANGELOG.md Add entries for 1.3.2, 1.4.0, 1.5.0
.github/workflows/release.yml Upgrade artifact and sigstore actions
.github/workflows/base-test.yml Expand matrix to Python 3.13/Django 5.2 and runner pin
Comments suppressed due to low confidence (4)

clickhouse_backend/patch/migrations.py:1

  • [nitpick] Suggested fix for the two issues above: parameterize the cluster name and compute the replica count. Using >= 2 ensures we're only treating clusters with actual replication as replicated.
from django.apps.registry import Apps

clickhouse_backend/driver/connection.py:1

  • [nitpick] Proposed change using try/finally to ensure deterministic teardown even on exceptions.
import re

tests/migrations/test_loader.py:1

  • [nitpick] Suggested parameterized form for the query in tests.
import compileall

clickhouse_backend/patch/migrations.py:1

  • [nitpick] Suggested improvement to reference the actual table being created in the error message.
from django.apps.registry import Apps

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.


with connection.cursor() as cursor:
cursor.execute(
f"select replica_num from system.clusters where cluster={connection.migration_cluster}"
Copy link
Preview

Copilot AI Sep 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsafe string interpolation in SQL and missing quoting for a string literal. Use a parameterized query to avoid SQL injection and handle string values correctly.

Suggested change
f"select replica_num from system.clusters where cluster={connection.migration_cluster}"
"select replica_num from system.clusters where cluster=%s",
[connection.migration_cluster],

Copilot uses AI. Check for mistakes.

Comment on lines +48 to +51
f"select replica_num from system.clusters where cluster={connection.migration_cluster}"
)
(replica_count,) = cursor.fetchone()
return replica_count >= 1
Copy link
Preview

Copilot AI Sep 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This query fetches a single row's replica_num instead of the replica count, so has_replicas may be incorrect. Count replicas for the cluster and compare the count rather than reading one row.

Suggested change
f"select replica_num from system.clusters where cluster={connection.migration_cluster}"
)
(replica_count,) = cursor.fetchone()
return replica_count >= 1
f"SELECT count(*) FROM system.clusters WHERE cluster = '{connection.migration_cluster}'"
)
(replica_count,) = cursor.fetchone()
return replica_count >= 2 # At least 2 replicas means there are replicas

Copilot uses AI. Check for mistakes.

Comment on lines +128 to +132
yield self

self.use_numpy = original_use_numpy
self.columnar = original_columnar

Copy link
Preview

Copilot AI Sep 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

State restoration won't run if an exception is raised inside the context. Wrap the yield in a try/finally to guarantee settings are reverted.

Suggested change
yield self
self.use_numpy = original_use_numpy
self.columnar = original_columnar
try:
yield self
finally:
self.use_numpy = original_use_numpy
self.columnar = original_columnar

Copilot uses AI. Check for mistakes.

def assertMigrationExists(self, conn, name, app, deleted=False):
with conn.cursor() as cursor:
cursor.execute(
f"SELECT * FROM distributed_django_migrations where name = '{name}'"
Copy link
Preview

Copilot AI Sep 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interpolating the migration name directly into SQL is unsafe. Parameterize the query to avoid SQL injection risks.

Suggested change
f"SELECT * FROM distributed_django_migrations where name = '{name}'"
"SELECT * FROM distributed_django_migrations WHERE name = %s", [name]

Copilot uses AI. Check for mistakes.

editor.create_model(self.Migration)
except DatabaseError as exc:
raise MigrationSchemaMissing(
"Unable to create the django_migrations table (%s)" % exc
Copy link
Preview

Copilot AI Sep 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error message hardcodes 'django_migrations' even when using a distributed table. Use the actual table name to avoid confusion.

Suggested change
"Unable to create the django_migrations table (%s)" % exc
"Unable to create the %s table (%s)" % (self.Migration._meta.db_table, exc)

Copilot uses AI. Check for mistakes.

Comment on lines +12 to +13
server ch1 clickhouse-node1:9000 check
server ch2 clickhouse-node3:9000 check
Copy link
Preview

Copilot AI Sep 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The server hostnames don't match the docker-compose service names. In compose, services are named 'node1' and 'node3', so HAProxy should target 'node1'/'node3' instead of 'clickhouse-node1'/'clickhouse-node3'.

Copilot uses AI. Check for mistakes.

Comment on lines +25 to +26
server ch1 clickhouse-node1:8123 check
server ch2 clickhouse-node3:8123 check
Copy link
Preview

Copilot AI Sep 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The server hostnames don't match the docker-compose service names. In compose, services are named 'node1' and 'node3', so HAProxy should target 'node1'/'node3' instead of 'clickhouse-node1'/'clickhouse-node3'.

Copilot uses AI. Check for mistakes.

@@ -1,5 +1,5 @@
from clickhouse_backend.utils.version import get_version

VERSION = (1, 3, 1, "final", 0)
VERSION = (1, 4, 0, "final", 0)
Copy link
Preview

Copilot AI Sep 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Version tuple indicates 1.4.0 while CHANGELOG documents 1.5.0 at the top. Align the package version with the latest documented release (or adjust the changelog) to avoid release confusion.

Suggested change
VERSION = (1, 4, 0, "final", 0)
VERSION = (1, 5, 0, "final", 0)

Copilot uses AI. Check for mistakes.

### 1.4.0

- feat: #119 Allow query results returned in columns and deserialized to `numpy` objects
- feat: #125 Add database functions `toStartOfMinute`, `toStartOfFiveMinutes`, `toStartOfTenMinutes`, `toStartOfFifteenMinutes` and `toStartofHour`
Copy link
Preview

Copilot AI Sep 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: 'toStartofHour' should be 'toStartOfHour' (capital 'O').

Suggested change
- feat: #125 Add database functions `toStartOfMinute`, `toStartOfFiveMinutes`, `toStartOfTenMinutes`, `toStartOfFifteenMinutes` and `toStartofHour`
- feat: #125 Add database functions `toStartOfMinute`, `toStartOfFiveMinutes`, `toStartOfTenMinutes`, `toStartOfFifteenMinutes` and `toStartOfHour`

Copilot uses AI. Check for mistakes.


### 1.3.2

- feat(aggragation-function): add anyLast function.
Copy link
Preview

Copilot AI Sep 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: 'aggragation' should be 'aggregation'.

Suggested change
- feat(aggragation-function): add anyLast function.
- feat(aggregation-function): add anyLast function.

Copilot uses AI. Check for mistakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

7 participants